85 research outputs found

Utterance generation for transaction dialogues

Author: Hessen Arjan van
Hulstijn Joris
Publication venue: ISCA
Publication date: 01/01/1998

This paper discusses the utterance generation module of a spoken dialogue system for transactions. Transactions are interesting because they involve obligations of both parties: the system should provide all relevant information; the user should feel committed to the transaction once it has been concluded. Utterance generation plays a major role in this. The utterance generation module works with prosodically annotated utterance templates. An appropriate template for a given dialogue act is selected by the following parameters: utterance type, body of the template, given information, wanted and new information. Templates respect rules of accenting and deaccenting

CiteSeerX

University of Twente Research Information

TwNC: a Multifaceted Dutch News Corpus

Author: Hessen Arjan van
Hondorp Hendri
Jong Franciska de
Ordelman Roeland
Publication venue: ELRA
Publication date: 01/01/2007

This contribution describes the Twente News Corpus (TwNC), a multifaceted corpus for Dutch that is being deployed in a number of NLP research projects, including tracks within the Dutch national research programme MultimediaN, the NWO programme CATCH, and the Dutch-Flemish programme STEVIN. Development of the corpus started in 1998 within a predecessor project, DRUID, and the corpus currently has a size of 530M words. The text part has been built from four different sources: Dutch national newspapers, television subtitles, teleprompter (auto-cue) files, and both manually and automatically generated broadcast news transcripts along with the broadcast news audio. TwNC plays a crucial role in the development and evaluation of a wide range of tools and applications for the domain of multimedia indexing, such as large vocabulary speech recognition, cross-media indexing, and cross-language information retrieval. Part of the corpus was fed into the Dutch written text corpus in the context of the Dutch-Belgian STEVIN project D-COI that was completed in 2007. The sections below describe the rationale that was the starting point for the corpus development, outline the cross-media linking approach adopted within MultimediaN, and finally provide some facts and figures about the corpus.

University of Twente Research Information

InfoLink: analysis of Dutch broadcast news and cross-media browsing

Author: Hessen Arjan van
Jong Franciska de
Morang Jeroen
Ordelman Roeland
Publication venue: IEEE
Publication date: 01/01/2005

In this paper, a cross-media browsing demonstrator named InfoLink is described. InfoLink automatically links the content of Dutch broadcast news videos to related information sources in parallel collections containing text and/or video. Automatic segmentation, speech recognition and available meta-data are used to index and link items. The concept is visualised using SMIL-scripts for presenting the streaming broadcast news video and the information links

University of Twente Research Information

Der Einsatz von Sprachtechnologie in Oral-History-Sammlungen

Author: Hessen Arjan van
Jong Franciska de
Scagliola Stef
Publication venue: Metropol Verlag
Publication date: 01/01/2013

This book chapter presents an overview of the techniques from the field of automatic speech recognition that can contribute to the enhanced accessibility of online oral history interview collections

EUR Research Repository

Erasmus University Digital Repository

University of Twente Research Information

Dialogues with a talking face for web-based services and transactions

Author: Hondorp Hendri
Hulstijn Joris
Nijholt Anton
Ruttkay Z.M.
van den Berk Mathieu
van Hessen Arjan
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/1999

In this paper we discuss our research on interactions in a virtual theatre that has been built using VRML and can therefore be accessed through Web pages. In the virtual environment we employ several agents. The virtual theatre allows navigation input through keyboard and mouse, but there is also a navigation agent which listens to typed input and spoken commands. System feedback is given using speech synthesis. We also have an information agent which allows a natural language dialogue with the system, where the input is keyboard-driven and the output is given both in tables and through template-driven natural language generation. Several talking faces for the different agents in the virtual world are in development. At this moment an avatar with a cartoon-like talking face driven by a text-to-speech synthesizer can provide users with information about performances in the theatre.

University of Twente Research Information

Lexicon optimization for Dutch speech recognition in spoken document retrieval

Author: Hessen Arjan van
Jong Franciska de
Ordelman Roeland
Publication date: 01/01/2001

In this paper, ongoing work concerning the language modelling and lexicon optimization of a Dutch speech recognition system for Spoken Document Retrieval is described: the collection and normalization of a training data set and the optimization of our recognition lexicon. Effects on lexical coverage of the amount of training data, of decompounding compound words and of different selection methods for proper names and acronyms are discussed

CiteSeerX

University of Twente Research Information

Dealing with Phrase Level Co-Articulation (PLC) in speech recognition: A first approach

Author: Hessen Arjan J. van
Leeuwen David A. van
Ordelman Roeland J.F.
Publication venue: ESCA
Publication date: 01/01/1999

Whereas within-word co-articulation effects are nowadays usually dealt with sufficiently in automatic speech recognition, this is not always the case for phrase level co-articulation (PLC) effects. This paper describes a first approach to dealing with phrase level co-articulation by applying PLC rules to the reference transcripts used for training our recogniser and by adding a set of temporary PLC phones that are later mapped back onto the original phones. In effect, we temporarily break down the acoustic context into a general context and a PLC context. With this method, more robust models could be trained because phones that are confused due to PLC effects, such as /v/-/f/ and /z/-/s/, receive their own models. A first attempt to apply this method is described.

University of Twente Research Information

Croatian Memories : speech, meaning and emotions in a collection of interviews on experiences of war and trauma

Author: Hessen Arjan van
Jong Franciska de
Petrovic Tanja
Scagliola Stef
Publication venue: ELRA
Publication date: 01/05/2014

In this contribution we describe a collection of approximately 400 video interviews recorded in the context of the project Croatian Memories (CroMe) with the objective of documenting personal war-related experiences. The value of this type of source is threefold: such interviews contain information that is missing in written sources, they can contribute to the process of reconciliation, and they provide a basis for reuse of data in disciplines with an interest in narrative data. The CroMe collection is not primarily designed as a linguistic corpus, but is the result of an archival effort to collect so-called oral history data. For researchers in the fields of natural language processing and speech analysis, life stories of this type may function as an 'objet trouvé' containing real-life language data that can prove useful for modelling specific aspects of human expression and communication.

CiteSeerX

EUR Research Repository

Erasmus University Digital Repository

University of Twente Research Information

"Croatian Memories": eine Interviewsammlung mit persönlichen Berichten über Krieg und Traumata

Author: Hessen A.J. (Arjan) van
Jong F.M.G. (Franciska) de
Scagliola S.I. (Stef)
Publication date: 01/01/2013

This book chapter summarizes the aims and methodology underlying the collection of oral history interviews that has been generated in the context of the project Croatian Memories. It includes an overview of facts and figures for the contents of the collection and the interviewees.

Erasmus University Digital Repository